I. INTRODUCTION


A. BACKGROUND 

The General Accounting Office sent a
preliminary report to Congress dealing with the acquisition
of major weapon systems [Ref. 1:p. 1]. The GAO reported
that the Navy had experienced a cost growth of $19 billion 
on twenty-four weapon systems in FY 1971, of which 15 
percent was attributed to poor cost estimation. Inaccurate 
cost estimates for weapon systems can result in program 
delays, cost overruns, acquisition of systems that are not 
the most cost effective, and a lack of taxpayer confidence 
in military leaders, to name only a few of the consequences. 
Congressional concern and a continuing need for better 
planning estimates have made it imperative that new 
techniques be developed and old methods be improved to 
obtain better cost estimates for major weapon system 
production and acquisition [Ref. 2:p. 1]. In the area of
cost estimation, an old technique that continues to be a 
significant tool is the learning curve. 

The first study addressing the learning curve phenomenon 
was documented by the pioneer of the learning curve, T. P.
Wright of the Curtiss-Wright Corporation, in his 1936 paper,
"Factors Affecting the Cost of Airplanes" [Ref. 3:p. 32].


Analysis of the data collected for a number of years
beginning in 1922 concerned the relationship of production
quantity with cost as measured in direct labor hours.
Wright claimed that each time the cumulative production
quantity doubled, the average unit cost for that quantity
decreased by a constant percentage, and that this relationship
plotted as a straight line on logarithmic paper. Wright's
formulation of the learning curve was:

$$ \bar{Y} = aX^b $$

where
$X$: cumulative production quantity
$\bar{Y}$: average cost per unit
$b$: factor of cost variation
$a$: direct manhour cost for production unit number one


Based on most of the literature available, it can safely
be said that the principal factors contributing to the
existence of this learning phenomenon include considerably
more than just operator learning. Conway and Schultz
[Ref. 4:p. 42] believe that learning in aircraft production
is influenced by a number of during-production factors
including:

1) incentive pay
2) changes in tooling
3) design changes
4) management learning
5) volume changes
6) quality improvements

The rate of a learning curve is usually described by the
complement of the reduction achieved when the production
quantity is doubled. This value is usually called the slope
of the curve and is found:

$$ S = \frac{Y_{2X}}{Y_X} = 2^b $$

where
$b$: slope of learning curve
$S$: fraction to which the cost decreases when production
quantity doubles
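
The relation between the slope S and the exponent b can be checked numerically. The following is a minimal sketch; the function names and the 80 percent example value are illustrative assumptions and do not come from the references.

```python
import math


def slope_from_exponent(b: float) -> float:
    """Fraction S to which cost falls when quantity doubles: S = 2**b."""
    return 2.0 ** b


def exponent_from_slope(s: float) -> float:
    """Inverse relation: b = log(S) / log(2)."""
    return math.log(s) / math.log(2.0)


# Hypothetical "80 percent" curve: cost falls to 80% at each doubling.
b = exponent_from_slope(0.80)
print(round(b, 4))                       # -0.3219
print(round(slope_from_exponent(b), 2))  # 0.8
```

A negative b corresponds to S below one, that is, to the presence of learning.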
Wright believed that the cumulative average learning
phenomenon plotted linearly on logarithmic scales and the
unit learning curve formulation derived from this cumulative
equation would be [Ref. 5:p. 266]:

$$ \bar{Y} = aX^b $$
$$ Y_c = \bar{Y} \cdot X = aX^{b+1} $$

So,

$$ Y_X = a\left[X^{b+1} - (X-1)^{b+1}\right] \approx a(b+1)X^b \quad \text{as } X \to \infty $$

where
$\bar{Y}$: average cost per unit
$Y_c$: total cumulative cost
$Y_X$: cost of the $X$th unit
$a, b$: parameters of the formulation
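
To see how quickly the exact unit cost implied by the cumulative average curve approaches its large-X limit, the two expressions can be compared directly. This is a sketch only; the first-unit cost of 100 manhours and the 80 percent slope are assumed values chosen for illustration.

```python
import math


def wright_unit_cost(a: float, b: float, x: int) -> float:
    """Exact cost of unit x when the cumulative AVERAGE follows a * x**b."""
    return a * (x ** (b + 1) - (x - 1) ** (b + 1))


def wright_unit_cost_limit(a: float, b: float, x: int) -> float:
    """Large-x approximation of the same unit cost: a * (b + 1) * x**b."""
    return a * (b + 1) * x ** b


a = 100.0                            # assumed first-unit cost in manhours
b = math.log(0.80) / math.log(2.0)   # assumed 80 percent curve

for x in (2, 10, 100, 1000):
    exact = wright_unit_cost(a, b, x)
    approx = wright_unit_cost_limit(a, b, x)
    print(x, round(exact, 3), round(approx, 3))
```

The gap between the two columns is largest for the earliest units, which is the region where learning curves are usually fitted.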

J. R. Crawford, another major contributor to the
literature and theory of learning curves, disagreed with 
T. P. Wright in the log-linear formulation of the cumulative 
average learning curve [Ref. 6:p. 21]. His disagreement was 
based on the apparently steep slope between early production 
units of the unit learning curve derived from the cumulative 
curve. In Crawford's studies, he described the learning
phenomenon in what has been termed the unit learning curve:

$$ Y_X = aX^b $$

where
$Y_X$: cost of the $X$th unit
$X$: cumulative amount of units produced
$a$: manhour cost for the first production unit
$b$: factor of cost variation
The cumulative average cost curve derived from the unit
curve is [Ref. 6]:

$$ Y_c = \sum_{x=1}^{X} Y_x = \sum_{x=1}^{X} a x^b $$
$$ \bar{Y} = \frac{1}{X} \sum_{x=1}^{X} a x^b $$

where
$Y_x$: cost of the $x$th unit
$Y_c$: total cumulative cost
$\bar{Y}$: average cost per unit produced
$a, b$: formulation parameters
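
The cumulative average implied by a unit learning curve has no closed form, but it is simple to evaluate by direct summation. A minimal sketch follows; the function name and the sample parameter values are assumptions for illustration.

```python
import math


def crawford_cumulative_average(a: float, b: float, x: int) -> float:
    """Average cost of the first x units when unit cost follows a * k**b."""
    total = sum(a * k ** b for k in range(1, x + 1))
    return total / x


a = 100.0                            # assumed first-unit cost in manhours
b = math.log(0.80) / math.log(2.0)   # assumed 80 percent unit curve

# Units 1 and 2 cost 100 and 80 manhours, so the average of the first two is 90.
print(round(crawford_cumulative_average(a, b, 2), 6))  # 90.0
```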
For years both the unit learning curve and the
cumulative average learning curve have been used almost
interchangeably. Womer and Patterson [Ref. 5:p. 266] show
that this is so because for large values of X, each
curve is a good approximation for the other. They go on to
say that a problem arises, however, since learning curves
are generally formulated on the first few units of output to
forecast the cost of an entire production. Even though
forecasts may be for large values of X, the data used to
make them are not. Under these circumstances, the estimated
cumulative average learning curve, for example, may approach
a unit learning curve, but not necessarily the same unit
curve that would be approximated from early units. Which
log-linear learning curve specification to choose, unit or
cumulative, has, through the years, been a source of
inaccurate cost estimation. Although 93 percent of all
firms utilize Crawford's unit learning curve [Ref. 7],
there are sufficient exceptions to the use of this unit
curve to imply that experience is the best guide for
choosing a particular model.

Following World War II, Gardner Carr of the McDonnell
Aircraft Corporation felt that representing learning curves
as linear on logarithmic paper was an inaccurate portrayal
of the learning phenomenon. In his April 1946 article
[Ref. 8:p. 77], Carr felt that the straight line was
adequate for overall project statistics but rarely
correct for budget or "actual cost" finding purposes. He
believed that the cumulative average learning curve was
S-shaped on the logarithmic scale. Explanations for the
various segment shapes of this curve are found in a RAND
report by Asher, "Cost-Quantity Relationships in the
Airframe Industry" [Ref. 6:p. 28].
Another study which suggested that learning curves do
not adhere to log-linearity was conducted by the Stanford
Research Institute following World War II. The Stanford
system utilizes the 'B-factor' which, basically, modifies
the standard learning curve for prior experience. The
formulation of this learning curve is:

$$ Y = a(X + B)^b $$

where
$Y$: cost per unit in manhours
$a$: theoretical first unit cost
$X$: cumulative quantity produced
$B$: modification factor
The effect of this formulation is a concave curve on the
logarithmic scale. The cost of the first unit is depressed
and the curve arcs to the standard learning curve [Ref. 7:
p. 52].

Further research that deviated from the log-linearity 
hypothesis was conducted. Another perspective of the 
production process is that various departments contribute to 
the overall quantity of direct labor hours. Generally 
speaking, these departments are fabrication, subassembly, 
major assembly, and final assembly. It seems obvious that each
department contributing to the learning curve would itself 


have its own learning curve. In order for the various 




departments to have their learning effects sum to an overall 
production process log-linear learning curve, each of the 
department slopes must be identical. In practice, the 
various departments often have different slopes. Summing 
these curves would result in a departure from log-linearity 
and arrive at a convex curve whose slope is bounded by the 
flattest of the component curves. In "Cost-Quantity
Relationships in the Airframe Industry" [Ref. 6:p. 69], 
Asher uses this argument while conducting a significant 
analysis disputing the log-linear hypothesis of the
formulation of the learning curve. In his report, he also 
cites research done previously by P. B. Crouse, G. M. 
Giannini, and P. Guibert supporting his contentions. Asher 
concludes, however, that his study 
. . . does not discredit the use of the linear progress
curve . . . . The linear curve is useful for making
extrapolations beyond the data range provided the number
of additional units is small. It is clearly a matter of
judgement whether or not in a specific instance the linear
curve is appropriate . . . . If allowable error is
relatively small, a convex curve resulting from predicting
each of the component curves separately is probably more
appropriate.

Another approach to research in the theory of learning
curves has involved the inclusion of production rate as an
explanatory variable in learning curve models. In his
1963 article [Ref. 9:p. 679], Alchian cites work done in 1948
that concluded production rate is not a relevant variable.
Whereas results published by Smith [Ref. 10:p. 138], and
supported by Kinton and Congleton [Ref. 11:p. 92], concluded
that production rate plays a significant role in explaining
the effects of learning, other studies with contradictory
results exist. Womer and Gulledge have produced a considerable
literature discussing the effects of production rate
which resulted in a final report for the Air Force [Ref. 12:
p. 5] addressing the contradictory results of previous
research, and they develop a cost function including
production rate and the cost-quantity relationship of
learning curve theory.

In his article "The Learning Curve: Historical Review
and Comprehensive Survey" [Ref. 13:p. 302], Yelle states that
most of the literature in learning curve theory, from its
inception through the 1960's, has focused primarily on
military applications in the early years through World War
II and on industry and business in the more recent years.
Through the years and various paths that research in this
area has followed, most of the studies do not reach
consistent conclusions. The early goals of developing a
general formulation of the learning curve that could be
applied to the entire aircraft industry or subsets of it
were quickly abandoned. Despite the vast amounts of
literature disputing the log-linear relationship between
cost and cumulative quantity produced, the unit learning
curve is still the most widely used formulation of the
learning curve in cost estimation today [Ref. 7].

B. OBJECTIVES

The preceding pages and references provide a brief
summary of the research expended on the theory of the
learning curve over the past half century. The important
point is that the learning phenomenon and the numerous
formulations of this theory in aircraft and other industries
have been an area of extensive research, and the learning
curve continues to be a viable tool in the world of economics.

The purpose of this research is to conduct an empirical
study of still another theoretical reformulation of the
learning curve. In "Budgets, Contracts, Incentives and
Costs: A Stylized Nexus", by Boger, Jones and Sontheimer
[Ref. 14], the cumulative average learning curve is
reformulated to examine the influence cost forecasting and
budget formation have on the incentives bearing on the firm
for cost control. The model developed by Boger et al., a
cumulative average learning curve model, and a unit learning
curve model will be estimated through simple linear and
non-linear regression techniques using several sets of aircraft
production data. For each formulation of the learning
curve, the models resulting from the two fitting techniques
will be analyzed, validated, and compared. Finally, the
Boger et al. model will be compared with the classical
learning curve models for empirical validation.

II. THE MODELS


A. CUMULATIVE AVERAGE LEARNING CURVE 

The cumulative average learning curve, as discussed 
above, was first formulated by T. P. Wright in the 1930's. 
The log-linear relationship between cumulative production
quantity and average cost per unit is:

$$ \bar{Y} = aX^b $$

where
$X$: cumulative production quantity
$\bar{Y}$: average cost per unit
$b$: factor of cost variation
$a$: direct manhour cost for first unit

The cumulative production quantity is usually expressed
as an integer number of units produced. The cost variable
is measured in direct manhours expended in the production of
the cumulative quantity produced. We expect the learning
curve slope, the factor of cost variation, to have a negative
value when we anticipate the presence of learning in the
production of some product. This formulation also
presupposes a relatively constant rate of production and
uniformity of units produced. Deviations from these last
assumptions are recognizable in a plot of the raw data,
i.e., toe up, toe down, bottom out, scallop.


B. UNIT LEARNING CURVE

The unit learning curve, as also discussed above, was
first formulated by J. R. Crawford. He disagreed with
Wright's log-linear formulation of the cumulative average
learning curve. Crawford believed the relationship between
cumulative quantity produced and the cost of the final unit
of that quantity was log-linear and was formulated as:

$$ Y_X = aX^b $$

where
$Y_X$: cost of the final unit
$X$: cumulative quantity produced
$a$: direct manhour cost for first unit
$b$: factor of cost variation


The same comments and assumptions concerning the cumulative 


average learning curve apply. 


C. BOGER, JONES, AND SONTHEIMER MODEL 

Boger, Jones, and Sontheimer express the costs of 
production over a time period as opposed to over the 
production of cumulative units regardless of time. They use 
the cumulative average learning curve as the starting point 


in their formulation.

As discussed above, the typical cumulative average
learning curve is of the form:

$$ Y(t) = aQ(t)^b \qquad (1) $$

where now
$Y(t)$: average cost per unit
$Q(t)$: cumulative quantity of units produced through
time $t$
$a, b$: learning curve parameters
The typical progress function (learning curve) treats the
inputs as varying continuously and causing a related
continuous variation in some product (output) [Ref. 14:
p. 23]. From (1) we can derive an expression for total
cost:

$$ X(t) = Q(t)\,Y(t) = aQ(t)^b\,Q(t) = aQ(t)^{b+1} \qquad (2) $$

where
$X(t)$: total quantity of inputs consumed by the production
of $Q(t)$
This specification yields the following marginal
requirements, $dX$, for an incremented output, $dQ$:

$$ \frac{dX}{dQ} = a(b+1)Q^b \qquad (3) $$

Now, assume the product emerges in quantities at discrete
time intervals. That is, we now develop an algorithm using
the cumulative average learning curve formulation based on
how many units are produced in a specified time period. In
application, we assume that progress or cost per quantity is
proportional to productivity achieved in prior production:

$$ \frac{X_t}{q_t} = \delta_t \frac{X_{t-1}}{q_{t-1}} \qquad (4) $$

where
$q_t = dQ$: amount produced in time period $t$
$X_t = dX$: inputs used in time period $t$
$\delta_t$: proportionality constant


We assume that learning is derived not only from the
preceding period but from all the production prior to the
period we are in. So we first set:

$$ X_t = a(b+1)Q^b q_t $$

where

$$ Q = Q(t) $$

Substituting (4) we get:

$$ \delta_t \frac{X_{t-1}}{q_{t-1}}\, q_t = a(b+1)Q^b q_t \qquad (5) $$


We now let $Q$, the quantity of units produced up to time $t$,
be equal to the quantity of units produced through time
period $t-1$. Now, substituting into (5):

$$ \delta_t \frac{X_{t-1}}{q_{t-1}}\, q_t = a(b+1)\left[\sum_{j=1}^{t-1} q_j\right]^b q_t \qquad (6) $$


Equation (6) assumes learning in period $t$ is derived only
from production in period $t-1$. We assume this relationship
must hold at previous time periods also. So rewriting (4)
and (5) for period $t-1$,

$$ X_{t-1} = \delta_{t-1} \frac{X_{t-2}}{q_{t-2}}\, q_{t-1} = a(b+1)\,Q'^{\,b}\, q_{t-1} $$

where
$Q'$: amount of units produced through time period $t-2$
Therefore,

$$ Q'^{\,b} = \left[\sum_{j=1}^{t-2} q_j\right]^b $$

which leads to:

$$ \frac{X_{t-1}}{q_{t-1}} = a(b+1)\left[\sum_{j=1}^{t-2} q_j\right]^b $$

and substituting into (6):

$$ \delta_t\, a(b+1)\left[\sum_{j=1}^{t-2} q_j\right]^b q_t = a(b+1)\left[\sum_{j=1}^{t-1} q_j\right]^b q_t $$

$$ \delta_t = \frac{\left[\sum_{j=1}^{t-1} q_j\right]^b}{\left[\sum_{j=1}^{t-2} q_j\right]^b} \quad \text{for } t = 3, 4, \ldots, T \qquad (7) $$


Now substituting (7) into (4) we have:

$$ \frac{X_t}{q_t} = \frac{\left[\sum_{j=1}^{t-1} q_j\right]^b}{\left[\sum_{j=1}^{t-2} q_j\right]^b} \cdot \frac{X_{t-1}}{q_{t-1}} $$



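Since this is true for all time periods, we can say:

$$ \frac{X_{t-1}}{q_{t-1}} = \frac{\left[\sum_{j=1}^{t-2} q_j\right]^b}{\left[\sum_{j=1}^{t-3} q_j\right]^b} \cdot \frac{X_{t-2}}{q_{t-2}} $$

$$ \frac{X_{t-2}}{q_{t-2}} = \frac{\left[\sum_{j=1}^{t-3} q_j\right]^b}{\left[\sum_{j=1}^{t-4} q_j\right]^b} \cdot \frac{X_{t-3}}{q_{t-3}} \quad \text{and so on} $$
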

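Substituting recursively we have:

$$ \frac{X_t}{q_t} = \frac{\left[\sum_{j=1}^{t-1} q_j\right]^b}{\left[\sum_{j=1}^{t-2} q_j\right]^b} \cdot \frac{\left[\sum_{j=1}^{t-2} q_j\right]^b}{\left[\sum_{j=1}^{t-3} q_j\right]^b} \cdot \frac{\left[\sum_{j=1}^{t-3} q_j\right]^b}{\left[\sum_{j=1}^{t-4} q_j\right]^b} \cdots \frac{\left[\sum_{j=1}^{2} q_j\right]^b}{q_1^b} \cdot \frac{X_2}{q_2} $$

$$ \frac{X_t}{q_t} = \frac{X_2}{q_2} \cdot \frac{\left[\sum_{j=1}^{t-1} q_j\right]^b}{q_1^b} $$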
where
$X_2/q_2$: direct manhours per quantity produced in second
time period
$\sum_{j=1}^{t-1} q_j$: total quantity of units produced prior to present
time period
$q_1$: quantity of units produced in time period one
$b$: factor of cost variation
$X_t/q_t$: average cost in direct manhours of units produced
in time period $t$
The length of the time period, although it must remain fixed 
over the data space, can be any length, i.e., day, month, or 
quarter. The quantity produced in a particular time period 
need not be an integer amount although partial units 
produced are generally not found in aircraft production 
data. As in the cumulative average and unit learning curve 
formulations, we expect the factor of cost variation to have 
a negative value. This model also presupposes uniformity 
between production units and also a constant production 


rate. 
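
The closed form reached by the recursive substitution can be evaluated period by period. The sketch below assumes that form, X_t/q_t = (X_2/q_2) [sum of q_j for j < t]^b / q_1^b; the function name, the per-period quantities, and the parameter values are hypothetical and serve only to illustrate the calculation.

```python
import math


def period_average_cost(q, x2_over_q2, b):
    """Predicted average cost X_t/q_t for periods t = 2..T.

    q          : per-period quantities [q_1, q_2, ..., q_T]
    x2_over_q2 : observed direct manhours per unit in period two
    b          : factor of cost variation (negative when learning is present)
    Returns a dict mapping period number t to the predicted X_t/q_t.
    """
    predictions = {}
    prior_total = q[0]  # sum of q_j for j = 1 .. t-1, starting at t = 2
    for t in range(2, len(q) + 1):
        predictions[t] = x2_over_q2 * (prior_total / q[0]) ** b
        prior_total += q[t - 1]  # add q_t before moving to period t + 1
    return predictions


b = math.log(0.80) / math.log(2.0)  # assumed 80 percent curve
pred = period_average_cost([2.0, 2.0, 4.0, 1.0], 100.0, b)
for t in sorted(pred):
    print(t, round(pred[t], 3))
```

With these assumed quantities the prior cumulative total doubles between successive periods, so each predicted average cost is 80 percent of the one before it.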

III. DATA





A. GENERAL 

The dependent variable in each of the models investigated
will involve a cost of some type. In each of our
models this cost will be measured as a function of direct
manhours expended in the production of some quantity of
units. Direct manhours will be defined as those hours spent
on fabrication, assembly, production flight, and other
production work associated with the basic aircraft. All
manhours pertaining to tooling, engineering, planning,
testing and subcontracting are excluded from this
definition. It seems obvious that the way in which direct
manhours are accumulated can, and does, lead to inconsistencies
due to differences in accounting systems from
contractor to contractor. The use of direct manhours has
numerous advantages over the use of dollars as a measure of
cost. In using direct manhours, we avoid the additional
data computations involved in applying price indices to
transform all dollar costs into constant dollars. We also
avoid inaccuracies in the data caused by using price indices
which are inexact figures. Finally, direct manhours is a
variable comparable over a group of contractors whereas, due
to differences in wage rates from contractor to contractor,
costs measured in dollars are not the best tool for
comparison.

The data for this report include aircraft production
data for the C-141 and F-102. The C-141 was produced by the
Lockheed Corporation and the F-102 was produced by General
Dynamics. The C-141 program produced 284 aircraft from July
1962 through April 1968. The C-141 is a large, swept wing,
4 jet engine cargo transport. The data for this study were
drawn from Orsini [Ref. 15]. Orsini obtained the
data from C-141 Financial Management Reports prepared by the
contractor, Lockheed Aircraft Corporation, for the Air
Force. The C-141 data provided a large sample of data for
which a basic model of the aircraft was produced throughout
the production program. Uniformity between units produced
is a basic assumption in the application of the learning
curve theory. Orsini aggregated the monthly production data
into quarterly direct manhour production data reducing the
total number of data points to twenty-four. Orsini felt
this quantity was sufficient for his analysis and the
current research is similarly restricted. The data
variables used by Orsini and this researcher are:

1) direct labor hours per lot per month 
2) aircraft per lot 
3) delivery dates of each aircraft 

The F-102 program produced 1000 aircraft from 1953
through 1958. The F-102 is a single seat, supersonic, delta
wing, all-weather fighter. The data for this study were
drawn from Gulledge and Womer [Ref. 12]. A
comprehensive cost breakdown by individual airframe was
provided by the "F-102 Program Cost History" document, the
source of the Womer and Gulledge data. The F-102 program
consisted of the production of F-102 airframes and TF-102
airframes. The TF-102 observations were not deleted for
the sake of strict uniformity, since it was assumed that
learning was experienced in the production of these
airframes. As Womer
and Gulledge note, the total manhours expended per airframe
can be disaggregated into three parts: details, assemblies,
and outside-of-factory labor. Total direct cost per
airframe is comprised of only detail and assembly hours.
The detail hours are comprised of fabrication hours, and
assembly hours include subassembly, major assembly, primary
assembly, and final assembly hours. After the portion of
labor hours expended per airframe outside the factory is
deleted, the total direct cost per airframe is left.


B. REFINEMENT 

As already discussed, three models will be utilized in
the examination of two sets of aircraft production data.
Parameter estimation for these models requires the data to be
in a particular form for each model. The C-141 production
data are available for aircraft grouped into production lots
and the F-102 production data are available for each
airframe. Since the models do not each fit the particular
form of each data set, adjustments and refinements need to
be made to the data to fit the different learning curve
formulations.

1. Cumulative Average Learning Curve

The data requirements for the cumulative average
learning curve are rather straightforward. The independent
variable is the cumulative quantity of aircraft produced. 
The dependent variable is the average amount of direct labor 
hours expended per unit in the production of the cumulative 
quantity produced. The F-102 and C-141 adjusted data used 
to fit the cumulative average learning curve are tabulated 
in Appendix A. 

The F-102 data consist basically
of total hours expended in the production of each airframe.
This data set is easily refined to meet the
data requirements of the cumulative average learning curve. As
previously discussed, the F-102 total direct manhours per 
aircraft consisted of three parts: details, assemblies, and 
outside of factory labor. Table I, extracted from Womer and 
Gulledge [Ref. 12:p. 86], provided the information necessary 
to translate the raw data into direct manhours per airframe. 
Since this table only applied to lots four through eleven, 
only these 204 observations were utilized. The airframes in
lots four through eleven were then ordered with respect to
delivery sequence number. It was this sequence--1, 2, 3,
..., 204--that provided the independent variable data
vector. The sequence of cumulative sums of direct manhours
divided by the cumulative amount of airframes delivered for
each element of that sequence provided the dependent
variable data vector.

TABLE I

PERCENT OF TOTAL MANHOURS ALLOCATED TO
THE ACTIVITIES BY CONTRACT

[Rows: Fabrication, Assembly, Outside of Factory; one column per contract. The percentage entries are illegible in this copy.]

The C-141 data were organized into twelve lots. The
number of units in each lot and the number of direct manhours
expended in the production of each lot of airframes are
provided. The data required for the cumulative average
learning curve are arrived at through a series of simple
calculations discussed in the RAND Memorandum "An Introduction
to Equipment Cost Estimating" [Ref. 16:p. 104]. The
cumulative average hours are computed at the final unit in 
each lot--where the cumulative average hour figures apply. 
Therefore, twelve data points will be used in the parameter 
estimation for the C-141 cumulative average learning curve 
formulation. 
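
The lot-to-cumulative-average conversion described above amounts to running totals evaluated at the last unit of each lot. A minimal sketch, using made-up lot sizes and hours rather than the C-141 figures:

```python
def cumulative_average_points(lot_sizes, lot_hours):
    """Return (cumulative units, cumulative average hours) at each lot's final unit."""
    points = []
    cum_units = 0
    cum_hours = 0.0
    for units, hours in zip(lot_sizes, lot_hours):
        cum_units += units
        cum_hours += hours
        points.append((cum_units, cum_hours / cum_units))
    return points


# Hypothetical data: two lots of 2 and 3 aircraft costing 200 and 240 manhours.
print(cumulative_average_points([2, 3], [200.0, 240.0]))
# [(2, 100.0), (5, 88.0)]
```

Each lot contributes one data point, which is why twelve lots yield twelve observations.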

2. Unit Learning Curve 

The data requirements for the unit learning curve 
are also rather straightforward. The independent variable 
is the cumulative quantity of aircraft produced. The 
dependent variable is the amount of direct manhours expended 
in the production of the final unit of the cumulative
quantity produced. The F-102 and C-141 adjusted data used 


to fit the unit learning curve are tabulated in Appendix B. 
The composition of the F-102 data again tends to be
easily refined to meet the data requirements of the unit 
learning curve. Table I is used to translate the raw data 
of lots four through eleven into direct manhours per 
airframe. The airframes were then ordered with respect to 
delivery sequence number. It was this sequence of 204 
airframes with each unit's respective direct labor hours 
required for production that are used as the independent and 
dependent variable data vectors for the estimation of the 
parameters of the unit learning curve. 
Since the C-141 production data are grouped into
lots, a rather gross approximating technique is required to 
transform the data into the form required by the unit 
learning curve specification. The average number of labor 
hours for each lot is treated as if it were an observation 
on the labor hours required to produce the unit at the lot 
midpoint. When dealing with a log-linear relationship, the 
arithmetic midpoint produces unequal areas under the 
learning curve between the first and last units of each 
respective lot. The exact determination of a true lot
midpoint depends on the lot quantity, type of curve
hypothesized, and the true slope of the learning curve [Ref. 16:
p. 105]. In order to avoid the shortcomings of the
arithmetic midpoint, the algebraic midpoint, $K$, discussed in
[Ref. 17:p. 44] will be used:

$$ K = \left[\frac{m(1+B)}{\left(L+\tfrac{1}{2}\right)^{1+B} - \left(F-\tfrac{1}{2}\right)^{1+B}}\right]^{-1/B} $$

where
$m$: lot quantity
$B$: learning curve slope
$L$: last unit of the lot
$F$: first unit of the lot
An estimate of $B$ from Womer and Patterson's report
[Ref. 5:p. 267] is used in calculating the algebraic
midpoint. Again, twelve data points are used in the
parameter estimation for the C-141 unit learning curve
specification.
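
A lot midpoint of this kind can be computed as sketched below. The sketch assumes the common integral approximation in which the lot spans from F - 1/2 to L + 1/2 and B is the (negative) exponent of the unit curve; those half-unit offsets, the function name, and the sample values are assumptions and are not confirmed by the references.

```python
def algebraic_midpoint(first: int, last: int, b: float) -> float:
    """Unit number K whose unit-curve cost equals the lot's average cost.

    Assumes the lot total is approximated by integrating x**b over
    [first - 0.5, last + 0.5]; b must be nonzero and not equal to -1.
    """
    m = last - first + 1  # lot quantity
    span = (last + 0.5) ** (1 + b) - (first - 0.5) ** (1 + b)
    return (m * (1 + b) / span) ** (-1.0 / b)


# Hypothetical lot covering units 1 through 10 on a curve with b = -0.32.
k = algebraic_midpoint(1, 10, -0.32)
print(k < (1 + 10) / 2)  # True: the midpoint falls before the arithmetic one
```

Because the curve is convex, the unit whose cost equals the lot average lies earlier in the lot than the arithmetic midpoint.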
3. Boger, Jones, and Sontheimer Model

The data requirements for this model are based on
the statement regarding the marginal requirements for
incremental outputs of product produced in Boger, Jones, and
Sontheimer's paper [Ref. 14:p. 23]. That is, the product
emerges in lots or lumps, $q_t$, at discrete intervals using
discrete inputs, $X_t$, of the composite resource (direct labor
hours). Therefore, the data requirements for this model
are: quantity of units produced each time period and the
direct labor hours expended in the production of units
produced in each time period.

The complete data base for the F-102 program
contains total labor hours for each airframe. These data are
not in the form required for the Boger et al. model. Womer
and Gulledge took considerable care in resolving the data
problems in their study [Ref. 12:p. 85]. Their work made the
data compatible with the theoretical model they were
testing. The information concerning the F-102 program that
Womer and Gulledge discuss made it possible to apply some
further adjustments to establish a data base compatible with
the Boger et al. model.

As discussed before, the ideal data for the Boger
et al. model is the total number of aircraft produced in a
specific time period, $q_t$, and the quantity of direct labor
hours, $X_t$, expended in producing $q_t$. Although this data is
not directly available, Womer and Gulledge derived the next
best alternative--cost by lot per month. Due to
non-availability of certain information, Womer and Gulledge
were only able to approximate the cost by lot per month for lots
four through eleven.

Tables I, II, and III along with the F-102 data base
in [Ref. 12:pp. 83-85] provided enough information to adjust
the data for lots four through eleven for use in the Boger
et al. model. The first adjustment was to use Table I and
the total labor hours expended on each airframe in lots four
through eleven to arrive at values for cumulative fabrication
and assembly hours for each airframe. As discussed
earlier, these hours comprise the direct labor hours
expended for each airframe. The next step was to calculate
the equivalent airframe units produced per month for each
lot. This was calculated by first determining the empirical
production rates for each lot:


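$$ Y_f = \frac{\sum_{\text{airframes in lot}} DMH_f}{\text{number of airframes in lot}} \quad \text{for lots } 4, 5, \ldots, 11 $$

$$ Y_a = \frac{\sum_{\text{airframes in lot}} DMH_a}{\text{number of airframes in lot}} \quad \text{for lots } 4, 5, \ldots, 11 $$

Production rate (fab) = $1/Y_f$

Production rate (assem) = $1/Y_a$

where
$DMH_f$: direct manhours for fabrication
$DMH_a$: direct manhours for assembly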
The production rates for fabrication and assembly were then
applied in conjunction with Tables II and III to the
cumulative fabrication and assembly hours per month per lot,
then added to arrive at equivalent aircraft produced per
month per lot. These results were then summed across lots
four through eleven for each month appropriately using
Tables II and III to arrive at equivalent units produced per
month. Direct labor hours expended per month on the
equivalent quantity of airframes produced per month were
similarly calculated. The adjusted F-102 production data
per month for lots four through eleven for use in the Boger
et al. model are summarized in Appendix C.
The original form of the C-141 data made available
to Orsini by the Air Force Plant Representative Office was
direct manhours per lot per month expended as direct labor
hours as defined previously and the quantity of aircraft per
lot. Orsini then aggregated this monthly data into
quarterly data points and tabulated it as direct manhours
per lot per quarter. The adjustments made to the data by
Orsini for his analysis were compatible with the refinements
required by the Boger et al. model. Average production
rate for each lot was first determined by dividing total
aircraft in each lot by the total amount of direct labor
hours attributed to the production of each respective lot.
This average production rate was then applied to the
tabulated quarterly data to arrive at equivalent units
produced per lot per quarter. The equivalent units produced
per lot per quarter and direct labor hours per quarter
were then summed across each lot for the quarters each lot
was worked on to arrive at equivalent units produced per
quarter and direct labor hours expended per quarter. The
data, as refined by Orsini, used in the Boger et al. model
are tabulated in Appendix C.
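
The equivalent-unit calculation described for the C-141 data can be sketched as follows. The function name, the data layout, and the numbers are hypothetical; only the procedure (lot rate equals aircraft divided by total lot hours, applied to each period's hours and summed across lots) follows the text.

```python
def equivalent_units(lot_sizes, lot_hours_by_period):
    """Convert lot-by-period manhours into per-period (q_t, X_t) series.

    lot_sizes           : aircraft in each lot
    lot_hours_by_period : lot_hours_by_period[l][t] is the direct manhours
                          charged to lot l in period t
    Returns (q, x): equivalent units and direct manhours for each period.
    """
    n_periods = len(lot_hours_by_period[0])
    q = [0.0] * n_periods
    x = [0.0] * n_periods
    for size, hours in zip(lot_sizes, lot_hours_by_period):
        rate = size / sum(hours)  # average aircraft per direct manhour in this lot
        for t, h in enumerate(hours):
            q[t] += rate * h      # equivalent units of this lot built in period t
            x[t] += h             # hours are simply summed across lots
    return q, x


# Hypothetical example: two lots of 4 and 6 aircraft worked over three quarters.
q, x = equivalent_units([4, 6], [[100.0, 300.0, 0.0], [0.0, 200.0, 400.0]])
print([round(v, 6) for v in q], x)
```

By construction the equivalent units of each lot sum to the lot size, so the per-period quantities sum to the total number of aircraft.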

IV. METHODOLOGY


A. LINEAR REGRESSION 

Historically, it has usually been assumed that the 
relationship between the independent and dependent variables 
of a learning curve specification is log-linear. This 
assumption has made it particularly easy to estimate the 
learning curve parameters through simple linear regression 
when only one independent variable is used. In this study,
the least squares, normal error regression model is
utilized. The normal error model is:

$$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \quad \text{for } i = 1, 2, \ldots, n $$

where
$Y_i$: observed response of the $i$th trial
$X_i$: the level of the independent variable in the $i$th trial
$\beta_0, \beta_1$: regression parameters
$\varepsilon_i$: residuals which are distributed $N(0, \sigma^2)$

Normality of the error terms seems reasonable since the 
residuals probably represent the accumulation of many 
effects that are omitted from the model. The cumulative 


error term, ei, vould tend to comply vith the central limit 


theorem and approach normality. Since the error terms are 
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assumed to be normally distributed, the assumption of no 
correlation between residuals becomes one of independence. 
Still yet, the assumption of normality allows one to perform 
some parametric statistical tests in evaluating the 
statistical significance of the estimated parameters and the 


aptness of the model. 
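
The log-linear estimation described above can be sketched in a few lines. This is only an illustrative sketch, not the software used in this study: the function name `fit_log_linear` and the data are hypothetical, and the data are generated from an exact 80 percent learning curve so the fit can be checked by eye.

```python
import math

def fit_log_linear(x, y):
    """Least squares fit of ln(y) = ln(b0) + b1 * ln(x); returns (b0, b1)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b1 = sxy / sxx                      # slope in log space
    ln_b0 = mean_y - b1 * mean_x        # intercept in log space
    return math.exp(ln_b0), b1

# Hypothetical data: first-unit cost of 1000 hours, 80 percent curve.
slope = math.log(0.8) / math.log(2)
units = [1, 2, 4, 8, 16]
hours = [1000 * u ** slope for u in units]
b0, b1 = fit_log_linear(units, hours)
```

Because the hypothetical data lie exactly on the curve, `b0` recovers the first-unit cost and `2 ** b1` recovers the learning rate.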


B. NON-LINEAR REGRESSION 

Non-linear regression software in STATGRAPHICS [Ref. 18: 
pp. 19-35] is used as an alternative method of parameter 
estimation. In this procedure, least squares estimates of 
the parameters of a non-linear model are determined. The 
learning curve formulations in this study are inherently
non-linear when the data are in their raw form. The
non-linear model is:

$$ Y_i = a X_i^{b} + \varepsilon_i, \qquad i = 1, 2, \ldots, n $$

where

$Y_i$: observed response of the $i$th trial
$X_i$: level of the independent variable of the $i$th trial
$a, b$: regression parameters
$\varepsilon_i$: residuals which are distributed $N(0, \sigma^2)$


The non-linear regression method utilized in the
STATGRAPHICS software was developed by D. W. Marquardt and
represents a compromise between the linearization (Taylor
series) method and the steepest descent method of non-linear
parameter estimation. Marquardt's compromise has been
described as combining the best features of the
linearization and steepest descent methods while avoiding
their most serious limitations. A detailed discussion and
references for this algorithm are contained in Draper and
Smith's Applied Regression Analysis, Second Edition [Ref. 19:
p. 471]. An important aspect of non-linear regression that
deviates from the linear case is worth noting. When the
error term of the non-linear model is assumed to be normally
distributed, the parameter estimates are no longer normally
distributed and the sample residual variance is no longer an
unbiased estimate of the residual variance. While suitable
comparison of mean squares can be made visually, the usual
F-tests for regression and lack of fit are not valid, in
general, for the non-linear case [Ref. 19:p. 484].
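
The idea behind Marquardt's compromise can be illustrated with a small damped least squares routine for the model $Y = a X^{b}$. This is a simplified sketch under stated assumptions and not the STATGRAPHICS implementation: the function name `fit_power_curve`, the damping schedule (divide or multiply by ten), the stopping rule, and the data are all hypothetical. A small damping value gives a step close to the linearization method; a large value gives a short step in the steepest descent direction.

```python
import math

def fit_power_curve(x, y, a, b, max_iter=200, tol=1e-12):
    """Damped least squares (Marquardt-style) fit of y = a * x**b."""
    def sse(a_, b_):
        return sum((yi - a_ * xi ** b_) ** 2 for xi, yi in zip(x, y))

    lam = 1e-3                      # damping parameter (assumed start value)
    current = sse(a, b)
    for _ in range(max_iter):
        # Partial derivatives of the model with respect to a and b
        fa = [xi ** b for xi in x]
        fb = [a * xi ** b * math.log(xi) for xi in x]
        res = [yi - a * xi ** b for xi, yi in zip(x, y)]
        h11 = sum(u * u for u in fa)
        h22 = sum(v * v for v in fb)
        h12 = sum(u * v for u, v in zip(fa, fb))
        g1 = sum(u * r for u, r in zip(fa, res))
        g2 = sum(v * r for v, r in zip(fb, res))
        # Inflate the diagonal of the normal equations by the damping factor
        d11 = h11 * (1 + lam)
        d22 = h22 * (1 + lam)
        det = d11 * d22 - h12 * h12
        if det == 0:
            break
        da = (g1 * d22 - g2 * h12) / det
        db = (g2 * d11 - g1 * h12) / det
        trial = sse(a + da, b + db)
        if trial < current:
            # Step reduced the error: accept it and relax the damping
            a, b, current = a + da, b + db, trial
            lam = max(lam / 10, 1e-12)
            if abs(da) <= tol * (1 + abs(a)) and abs(db) <= tol * (1 + abs(b)):
                break
        else:
            # Step failed: increase damping toward steepest descent
            lam *= 10
            if lam > 1e12:
                break
    return a, b

# Hypothetical data lying exactly on y = 100 * x**(-0.3)
xs = list(range(1, 11))
ys = [100 * v ** -0.3 for v in xs]
a_hat, b_hat = fit_power_curve(xs, ys, a=90.0, b=-0.25)
```

In practice the starting values would come from a prior fit, such as the log-linear regression, rather than being chosen by hand as they are here.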


C. DATA ANALYSIS

Examination of the observed residuals of a regression
model is an important aspect of any regression technique.
If the model is appropriate, the observed residuals should
reflect the properties assumed for the error term in the
regression model. In this study, both graphical and
statistical tests involving the residuals will be performed.
Evaluation of the residuals of the various models to be
considered will address possible departures from the model,
including: the regression model does not hold, the error
terms do not have constant variance, the error terms are not
independent, the model fits all but one or a few outliers,
and the error terms are not normally distributed.

After fitting a model to the data, residuals falling 
into a horizontal band centered at zero displaying no 
systematic tendencies to be positive or negative and 
appearing to be randomly scattered would suggest the 
assumptions of the model do not appear to be violated. This 
would imply the model is well suited to the data. If this
is not the case, remedial measures would need to be taken.
Generally speaking, there are two types of remedial measures 
that are normally followed: abandon the model altogether or 
use some transformation on the data so the model is appro- 
priate for the transformed data. In this report, only two 
aspects of data transformation will be reckoned with: 
autocorrelation and the handling of outliers. When these 
two problems are dealt with and further residual analysis 
clearly implies the assumptions of the model are not met, 
the model will be rejected. 

1. Autocorrelation 

The regression models of ordinary least squares or
maximum likelihood techniques consider the stochastic
disturbance terms, the residuals of the regression, to be
either uncorrelated or independent normal random variables.
In the application of regression models to learning curves,
we use time series data. The assumption of no correlation
or independence between error terms for time series data is
often inappropriate. The observed correlation between
residuals of regression modeling is called autocorrelation
or serial correlation.
Neter and Wasserman outline the problems associated
with autocorrelation:

i) The regular least squares regression coefficients are
still unbiased but no longer have the minimum
variance property and may be quite inefficient.

ii) The mean squared error (MSE) may seriously
underestimate the variance of the error terms.

iii) The estimated standard deviation of the regression
coefficients may be seriously underestimated and $R^2$
may be overestimated.

iv) The confidence intervals and tests using the
Student's t and F distributions are no longer
strictly applicable. [Ref. 20:p. 352]

In this study, the existence of first-order
autocorrelation, AR(1), will be investigated graphically and
will be statistically tested using the Durbin-Watson test.
If autocorrelation indeed exists after examination of the
residuals, this information will be used to improve the
regression model. The autocorrelation will be modeled and
accounted for in a transformation of the model data.

The first-order autocorrelation error model
discussed by Neter and Wasserman [Ref. 20:p. 353] for a
simple linear regression is:

$$ Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t $$
$$ \varepsilon_t = \rho \varepsilon_{t-1} + u_t $$

where

$\rho$: autocorrelation parameter, $|\rho| < 1$
$u_t$: independent and distributed $N(0, \sigma^2)$

The following discussion also applies in a nonlinear model
when the error term is additive. It can be shown that the
properties of the error terms lead to the following
conclusions:

i) $E(\varepsilon_t) = 0$
ii) $\mathrm{Var}(\varepsilon_t) = \sigma^2 \sum_{s=0}^{\infty} \rho^{2s} = \sigma^2 / (1 - \rho^2)$
iii) $\mathrm{Cov}(\varepsilon_t, \varepsilon_{t-s}) = \rho^{s} \sigma^2 / (1 - \rho^2), \qquad s \neq 0$

These imply the error terms for the first-order
autoregressive model are autocorrelated unless the
autocorrelation parameter, $\rho$, equals zero [Ref. 20:p. 357].

When the autocorrelation parameter, $\rho$, is not zero,
it will be necessary to estimate the value of $\rho$ for use in
the autoregressive structure as a source of additional
information in our regression model.

Following a graphical inspection of the residuals,
the Durbin-Watson test will be utilized to test the
hypothesis:

$$ H_0: \rho = 0, \text{ implying no autocorrelation} $$

The test statistic, D, used in this text is:

$$ D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} $$

where

$e_i$: $i$th residual of the regression model
$n$: number of data points used in the regression


If we reject the null hypothesis, this test statistic, D,
can be used further to estimate the autocorrelation
coefficient, $\rho$. The estimate of $\rho$, $r_1$, is discussed by
Neter and Wasserman [Ref. 20:p. 358] and is:


For sufficiently large n, an alternative estimator of $\rho$
derived by Theil and Nagar [Ref. 21:p. 164] is:

$$ r_2 = \frac{n^2 (1 - D/2) + k^2}{n^2 - k^2} \qquad (2) $$

where k is the number of parameters to be estimated in the
regression model.

When $n \gg k$ then

$$ r_3 \approx 1 - \frac{D}{2} \qquad (3) $$

The estimators for the autocorrelation parameter, $\rho$, in
equations (2) and (3) will be used in this study.
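
As an illustration, the Durbin-Watson statistic and two estimates of the autocorrelation parameter derived from it can be computed directly from a list of residuals. The sketch below is hypothetical: the function names are not from any package, and the Theil and Nagar expression is written in its commonly cited form, which is an assumption about the form intended in this text.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

def rho_large_sample(d):
    """Approximation used when n is much larger than k."""
    return 1 - d / 2

def rho_theil_nagar(d, n, k):
    """Commonly cited Theil and Nagar form (assumed), with n points and
    k estimated regression parameters."""
    return (n * n * (1 - d / 2) + k * k) / (n * n - k * k)

# Hypothetical residuals that alternate in sign (negative autocorrelation)
example = [1.0, -1.0, 1.0, -1.0]
d_stat = durbin_watson(example)
```

Values of D near 2 correspond to an estimate of the autocorrelation parameter near zero; values near 0 correspond to strong positive autocorrelation.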

The iterative method of incorporating the first-order
autoregressive model into the regression model is used
and discussed in Neter and Wasserman [Ref. 20:p. 361] and
Theil and Nagar [Ref. 21:p. 164]. The data are first
transformed:

$$ X_1' = \sqrt{1 - r_j^2}\, X_1 \qquad \text{for } j = 1, 2, \text{ or } 3 $$
$$ Y_1' = \sqrt{1 - r_j^2}\, Y_1 \qquad \text{for } j = 1, 2, \text{ or } 3 $$
$$ X_t' = X_t - r_j X_{t-1} \qquad \text{for } t = 2, \ldots, n; \; j = 1, 2, \text{ or } 3 $$
$$ Y_t' = Y_t - r_j Y_{t-1} \qquad \text{for } t = 2, \ldots, n; \; j = 1, 2, \text{ or } 3 $$

The regression is then performed with the transformed data.
The Durbin-Watson test is then employed to test whether the
new residuals for the transformed data are uncorrelated.
The procedure discussed above continues until the Durbin-
Watson null hypothesis is accepted.
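
A single pass of the data transformation might look like the following sketch. It assumes that the first observation is retained and scaled by the square root of one minus the squared estimate, and that every later observation is replaced by its difference from the scaled previous observation. The function name `ar1_transform` and the numbers are hypothetical.

```python
import math

def ar1_transform(values, r):
    """Transform one data series using an autocorrelation estimate r."""
    first = math.sqrt(1 - r * r) * values[0]          # retained first point
    rest = [values[t] - r * values[t - 1]             # generalized differences
            for t in range(1, len(values))]
    return [first] + rest

# Hypothetical series and estimate
transformed = ar1_transform([2.0, 4.0, 6.0], 0.5)
```

The same function would be applied to both the dependent and the independent series before the regression is repeated.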
2. Outliers 

The presence of outliers can cause some difficulty
when fitting a model using the least squares method.
Outliers can either be errant observations or result
from an interaction with a variable that is not included
in the model. In either case, when outliers exist, those
particular data points should be addressed. If evidence
exists that abnormal circumstances surround a particular
data point, it is safe to discard it. To address outliers,
the analyst must be familiar with the data or have the
resources to investigate them adequately. In this report,
the resources to adequately address the nature of outliers
do not exist; therefore, residuals lying beyond a fixed
multiple of $\sqrt{MSE}$ from zero will be designated
as outliers and rejected but annotated.

3. Normality of Error Terms 

As discussed by Neter and Wasserman [Ref. 20: p. 
107], small departures from normality do not create any 
serious problems in the fitting of the model. Major 
departures, on the other hand, should be of concern. The 
normality assumption will be graphically addressed through 


probability and symmetry plots. A rough statistical test
addressing normality of the error terms is discussed in
Neter and Wasserman [Ref. 20:p. 107]. If 90 percent of
the standardized residuals, $e_i/\sqrt{MSE}$, fall between the
appropriate standard normal values or the corresponding
Student's t-values for small sample sizes, the normal
assumption will not be rejected.
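
The rough check on normality can be sketched as a count of standardized residuals inside a chosen bound. In this hypothetical sketch the bound is supplied by the analyst (for example, a standard normal or Student's t value taken from a table), MSE is computed with n minus the number of estimated parameters as its divisor, and the residuals are invented.

```python
import math

def share_within(residuals, n_params, bound):
    """Fraction of standardized residuals e / sqrt(MSE) with magnitude <= bound."""
    n = len(residuals)
    mse = sum(e * e for e in residuals) / (n - n_params)
    scale = math.sqrt(mse)
    inside = sum(1 for e in residuals if abs(e / scale) <= bound)
    return inside / n

# Hypothetical residuals from a two-parameter regression
resid = [1.0, -1.0, 1.0, -1.0, 3.0, -3.0]
share_one = share_within(resid, n_params=2, bound=1.0)
share_wide = share_within(resid, n_params=2, bound=1.645)
```

The returned fraction is then compared with the share expected under normality for the bound that was chosen.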
4.  Homoscedasticity 

The assumption of constant variance of the residuals 
will also be addressed graphically and statistically. 
Residual plots will initially be inspected prior to con- 
ducting a non-parametric rank correlation test between the 
absolute value of the residual and the value of the
independent variable, as discussed in Conover [Ref. 22:p. 255]. The
assumptions of constant variance will be rejected if the 
hypothesis of no correlation is rejected in this non- 


parametric test. 
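
A rank correlation check of this kind can be sketched as follows. The sketch computes the sum of squared rank differences between the independent variable and the absolute residuals, which is the form of the Hotelling-Pabst statistic assumed here, together with the matching Spearman coefficient for data without ties. Function names and data are hypothetical, and critical values would still have to be read from a table.

```python
def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average
        i = j + 1
    return out

def hotelling_pabst(x, residuals):
    """Sum of squared differences between ranks of x and ranks of |residual|."""
    rx = ranks(x)
    re = ranks([abs(e) for e in residuals])
    return sum((a - b) ** 2 for a, b in zip(rx, re))

def spearman_from_t(t, n):
    """Spearman coefficient implied by T when there are no ties."""
    return 1 - 6 * t / (n * (n * n - 1))

# Hypothetical case: residual spread shrinks steadily as x grows
t_stat = hotelling_pabst([1, 2, 3, 4], [0.4, -0.3, 0.2, -0.1])
```

A statistic close to its extreme values indicates that the residual spread changes systematically with the independent variable.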


D. INFERENCES CONCERNING PARAMETER ESTIMATION 

Following verification of the underlying assumptions of
a simple linear regression, it is of interest to investigate
the statistical significance of the parameter estimates
in the model:

$$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i $$

It is of interest to initially test the hypothesis:

$$ H_0: \beta_1 = 0 \qquad H_a: \beta_1 \neq 0 $$

to see if there is a statistically significant linear
relationship between the independent and dependent variables.
It can be shown that if the underlying assumptions of the
model hold, the parameter estimate of $\beta_1$, $b_1$, is normally
distributed [Ref. 20:p. 53]. Therefore, $(b_1 - \beta_1)/s(b_1)$ is
distributed as $t(n - 2)$. Furthermore, the test to decide
whether $\beta_1$ is statistically equal to zero is based on the
test statistic:

$$ T_1 = b_1 / s(b_1) $$

The decision rule, at a significance level $\alpha$, is given by
Neter and Wasserman as [Ref. 20:p. 61]:

Accept $H_0$ if $|T_1| \le t(1 - \alpha/2;\; n - 2)$

Otherwise reject $H_0$

Similarly, it can be shown that inferences concerning $\beta_0$ are
analogous to those for $\beta_1$ [Ref. 20].
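
The test statistic for the slope can be computed from the data as in the sketch below. It uses the standard simple linear regression result that the estimated variance of the slope is MSE divided by the sum of squared deviations of the independent variable; the function name and the data are hypothetical, and the critical t value would be taken from a table.

```python
import math

def slope_t_statistic(x, y):
    """Return (b1, s_b1, t) for the least squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((u - mean_x) ** 2 for u in x)
    b1 = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((v - b0 - b1 * u) ** 2 for u, v in zip(x, y))
    mse = sse / (n - 2)                 # two parameters are estimated
    s_b1 = math.sqrt(mse / sxx)         # estimated standard deviation of b1
    return b1, s_b1, b1 / s_b1

# Hypothetical data with a nearly linear trend
b1_hat, s_b1_hat, t_stat_b1 = slope_t_statistic(
    [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
```

For learning curve work the same calculation would be applied to the ln transformed variables.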

The usual tests that are appropriate in the linear model
are, in general, not appropriate when the model is
non-linear. Draper and Smith [Ref. 19:p. 484] discuss why this
is so and also present a practical procedure that can
provide a measure of possible lack of fit for a non-linear
model. In the non-linear case, no statistical tests
concerning the parameter estimates will be discussed in this
study. Instead, the results of the non-linear regression
will only be compared to those of the simple linear
regression.


E. VALIDATION 

Since time series data is being used, it is not possible
to split the developmental data and the validation data
randomly. For each learning curve formulation and the two
methods of parameter estimation, roughly the first
seventy-five percent of the data is used to fit each regression
model. The remaining data is saved to validate the
forecasting ability of the fitted model. While the validation
phase of model building is important, the criterion of the
validation phase, that is, determining how well a model
forecasts, is subjective, and goodness can vary depending on
the needs of the user. In this research, several measures
of forecasting accuracy will be used to quantify model
results. The measures selected for this analysis are the
mean percent error (MPE), the mean absolute percent error
(MAPE) and the Pearson correlation coefficients adjusted for
degrees of freedom. MPE is defined as:

$$ MPE = \frac{100}{n} \sum_{t=1}^{n} \frac{A_t - P_t}{A_t} $$

MAPE is defined as:

$$ MAPE = \frac{100}{n} \sum_{t=1}^{n} \frac{|A_t - P_t|}{A_t} $$

where

$A_t$: actual or realized value at time t
$P_t$: prediction or forecast value at time t

The Pearson correlation coefficients are defined as:

$$ R^2(\text{fitted}) = 1 - \frac{\mathrm{Var}(r)/dof}{\mathrm{Var}(Y)/dof} $$

$$ R^2(\text{validation}) = 1 - \frac{\mathrm{Var}(rr)/dof}{\mathrm{Var}(Y)/dof} $$

where

Var(r): sample variance of the residuals of the fitted
model
Var(rr): sample variance of the residuals of the forecast
values
Var(Y): sample variance of the developmental dependent
data

Whereas MPE provides a measure of the percent bias in the
forecasts, MAPE will always be at least as large as the
absolute value of MPE and provides a measure of dispersion
of the forecasts (see Boger and Jayachandran, Ref. 23:p. 11).
Comparison of $R^2$ (fitted) and $R^2$ (validation)
quantitatively evaluates the relative variability of the
forecasting ability of the model beyond the developmental
range.
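
The two percent error measures can be sketched as below, using the sign convention of actual minus predicted, so that a negative MPE corresponds to forecasts that are too high. The function names and the numbers are hypothetical.

```python
def mpe(actual, predicted):
    """Mean percent error: signed, so it measures bias."""
    n = len(actual)
    return 100.0 / n * sum((a - p) / a for a, p in zip(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percent error: unsigned, so it measures dispersion."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - p) / a for a, p in zip(actual, predicted))

# Hypothetical case 1: errors of opposite sign cancel in MPE but not in MAPE
mpe_mixed = mpe([100.0, 200.0], [110.0, 180.0])
mape_mixed = mape([100.0, 200.0], [110.0, 180.0])

# Hypothetical case 2: every forecast is too high, so MAPE equals -MPE
mpe_high = mpe([100.0, 200.0], [110.0, 220.0])
mape_high = mape([100.0, 200.0], [110.0, 220.0])
```

The second case shows why MAPE equal to the absolute value of MPE indicates forecasts that fall consistently on one side of the actual values.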

In this study, the level of the independent variable
beyond the developmental range is fixed. The conditional
predictions of the dependent variable, $\hat{Y}_t$, for the
regression models for each learning curve specification are
based on the following relations:

1) Linear Regression Model

a) Autocorrelation is not modeled

$$ \ln \hat{Y}_t = \ln \hat{\beta}_0 + \hat{\beta}_1 \ln X_t $$
$$ \hat{Y}_t = \exp(\ln \hat{Y}_t) $$

b) Autocorrelation is modeled

where $\hat{Y}_{t-1}$ is equal to exp(the last fitted value of
the developmental data) for the initial predicted
value.

2) Nonlinear Regression Model

a) Autocorrelation is not modeled

$$ \hat{Y}_t = \hat{\beta}_0 X_t^{\hat{\beta}_1} $$

b) Autocorrelation is modeled

$$ \hat{Y}_t = \hat{\rho} \hat{Y}_{t-1} + \hat{\beta}_0 \left( X_t^{\hat{\beta}_1} - \hat{\rho} X_{t-1}^{\hat{\beta}_1} \right) $$

where $\hat{Y}_{t-1}$ is equal to the last fitted value of the
developmental data for the initial predicted value.

where

$\hat{\beta}_0, \hat{\beta}_1$: estimated parameters of the regression
$X$: independent variable of the bivariate data that
was not used for developing the model
$\hat{\rho}$: estimated autocorrelation parameter

F. COMPARISONS 

In this study, three learning curve specifications are
being investigated: the unit learning curve, the cumulative
average learning curve, and the Boger et al. learning
curve. Each specification will be fitted using both a
simple linear regression model and a nonlinear regression
model.

1. Regression Models 

The first comparison that will be investigated, 

which is of secondary interest in this study, will be the 
relative fit of each model and the differences between the 
linear regression and nonlinear regression methods, with and 
without transformations of the data for autocorrelation, for 
each learning curve specification. The approach to be used 
for these comparisons will be strictly graphical. For each 
model specification the dependent variable of the develop- 
mental data will be plotted against the observed dependent 
variable of the developmental data and each of the fitted 
variables. 

2. Learning Curve Specifications 

The basis for comparison between the unit,
cumulative average, and the Boger et al. learning curve
specifications is the difference between actual cost per
lot and each model's fitted cost per lot. Each model's
fitted cost per lot can be arrived at through some
relatively simple calculations using the data refinement
procedures discussed above, applied to the results of each
regression technique. The initial comparison of the fitted
lot costs will be done graphically. For each model
specification and regression technique, the observed cost
per lot and the fitted cost per lot will be plotted against
the respective lot numbers for the data within the
developmental range. Where the differences between observed
and fitted lot costs are not obvious by graphical means, a
statistical test will be employed to attach statistical
significance to the difference. The non-parametric test to
be utilized will be the Kruskal-Wallis test
[Ref. 22:p. 229], where the populations are the different
model specifications and regression techniques. The samples
within each population are the absolute values of the
differences between the observed and fitted cost per lot.
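
The Kruskal-Wallis statistic for such a comparison can be sketched as follows. This minimal version assumes there are no tied values, so it omits the correction for ties; the function name and the samples of absolute differences are hypothetical, and the result would be compared with a tabled critical value.

```python
def kruskal_wallis_h(samples):
    """Kruskal-Wallis H for a list of samples, assuming no tied values."""
    pooled = sorted(v for sample in samples for v in sample)
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # valid only without ties
    n_total = len(pooled)
    weighted = sum(sum(rank[v] for v in sample) ** 2 / len(sample)
                   for sample in samples)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical absolute differences for two model specifications
h_stat = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

Each inner list holds the absolute differences between observed and fitted cost per lot for one model specification and regression technique.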
V. RESULTS


A. DATA ANALYSIS 
Two sets of production data and three learning curve
specifications for each data set were investigated in this
research. A fairly extensive analysis was performed on the
results of each type of regression for each learning curve
specification and each data set. The results of each
analysis generally led to further modifications of the
data, calling for even more regressions and residual
analyses. Twenty-six regressions, sixteen linear
regressions and ten nonlinear regressions, were performed during
the course of this study. For the sake of brevity, only one
analysis for a single learning curve specification and
production data set, which was typical of the analyses
performed in all other cases, will be discussed at length.
The results of the other regressions and analyses are
tabulated in Appendices D, E, F, G, H, and I.
1. Boger et al. Model: C-141 Data Analysis
The first 18 of the 24 total bivariate observations
were selected to fit the linear regression model for the
Boger et al. specification of the learning curve. The
remaining six data points were withheld for validation
purposes. Figure 1 is a scatter plot of the raw data and
the natural log (ln) transformed data. The ln transformed
data scatter plot has seventeen data points because the first
observation of the independent variable vector was
necessarily omitted: its value is not finite when ln
transformed.

Figure 1. Raw Data and Ln Transformed Data (raw data, 18
observations; ln transformed data, 17 observations)

The first linear regression was performed using the
17 data points (observations 2 through 18). Inspection of
the residuals plotted against time and against the fitted
values, Figure 2, revealed that the residuals were not
patternless. The systematic structure of the residuals
implied that the residuals did not reflect the assumptions
of the linear model. The cyclic pattern of the residuals,
furthermore, suggested the presence of first-order
autocorrelation and encouraged more investigation. The
Durbin-Watson statistic derived from this set of residuals
led to a rejection of the null hypothesis ($H_0$: $\rho = 0$),
implying statistical significance of the presence of
first-order autocorrelation. The initial inspection of the
residuals also addressed the question of outliers. Since no
residuals were outside the interval specified for data
rejection, no observations were omitted from the data set.
Table IV highlights the results of the initial linear
regression.

Since the sample size was small in relation to the
number of parameters being estimated, the Theil and Nagar
estimate for the first-order autocorrelation parameter was
utilized. The values in parentheses adjacent to the
estimated parameters in Table IV are the Student's t
statistics for the respective coefficients.

Figure 2. Residual Plots (residuals vs. time; residuals vs.
fitted values; 17 observations)

TABLE IV

LINEAR REGRESSION 1 RESULTS

$\ln \beta_0$ :  13.332  ([illegible])
$\beta_1$     :  -.2821  ([illegible])
D.W.          :  .5941
N             :  17
$\hat{\rho}$  :  .6987
$R^2$         :  .92

The autocorrelation was then modeled into the ln
transformed data, resulting in Figure 3. The data point in
the upper left hand corner seems to be a typical result when
autocorrelation is modeled into the data using the technique
employed in this study. A second linear regression was
performed on these 17 observations. The scatter plots of the
residuals against time and against the fitted values,
Figure 4, again were not patternless and suggested
the presence of autocorrelation. Due to the small sample
size, the first observation had a dramatic effect on the
regression and, subsequently, the residuals. The
Durbin-Watson statistic again reflected a statistically
significant amount of autocorrelation present in the
residuals. Further modeling of autocorrelation into the data
yielded similar results. Inspection of the probability plot,
Figure 5, a symmetry plot of the residuals, the "rough cut"
measure of normality (94 percent of the standardized
residuals within the appropriate Student's t value) and the
Hotelling-Pabst statistic (T=286, N=17) supporting constant
variance did not suggest major departures from the other
distributional assumptions of the model. The results of the
second linear regression are highlighted in Table V.

Figure 3. Ln Transformed Data Adjusted for
Autocorrelation (17 observations)

While considerable literature exists discussing the
need to retain the first observation for further
regressions after autocorrelation has been modeled into the
data, especially when sample size is small, the second
regression resulted in an unexpected value for $b_1$. A third
regression was performed after omitting the first
observation to see what effects would be seen in parameter
estimation and prediction results. The residuals plotted
against time, Figure 6, appear to be more
randomly scattered in a narrow horizontal band about zero.
Furthermore, the probability plot and histogram, Figure 7,
and the "rough cut" measure of normality (94 percent of
the standardized residuals within the appropriate Student's
t value) support the distributional assumptions of the
model. The Durbin-Watson statistic and the test for
homoscedasticity (Hotelling-Pabst statistic, T=572, N=16)
suggested the other assumptions of the model were not
violated. The results of the third linear regression are
highlighted in Table VI.

Figure 4. Residual Plots (residuals vs. time; residuals vs.
fitted values; 17 observations)

TABLE V

LINEAR REGRESSION 2 RESULTS

$\ln \beta_0$ :  [illegible]
$\beta_1$     :  [illegible]
D.W.          :  [illegible]
N             :  17
$\hat{\rho}$  :  .6054
$R^2$         :  .73

Figure 5. Normal Probability Plot of Residuals

Figure 6. Residual Plot (residuals vs. time, 16 observations)

Figure 7. Normal Probability and Density Plots (N=16)


TABLE VI

LINEAR REGRESSION 3 RESULTS

$\ln \beta_0$ :  4.399  ([illegible])
$\beta_1$     :  .4877  ([illegible])
D.W.          :  [illegible]
N             :  16
$R^2$         :  [illegible]


The nonlinear regressions were performed using 17
bivariate observations (2 through 18). The initial
parameter estimates for $\beta_0$ and $\beta_1$ were taken from the
results of the first linear regression. The other initial
values required by the STATGRAPHICS nonlinear estimation
panel used the system default values. The results of the
first nonlinear regression are highlighted in Table VII.
Inspection of the residuals plotted against time, Figure 8,
and the Durbin-Watson statistic led to acceptance of the
alternative hypothesis that first-order autocorrelation is
present.

TABLE VII

NONLINEAR REGRESSION 1 RESULTS

$\beta_0$     :  [illegible]
$\beta_1$     :  -.214
D.W.          :  .86
N             :  17
$\hat{\rho}$  :  .5629
$R^2$         :  .96

Figure 8. Residual Plot (residuals vs. time, 17 observations)

The second nonlinear regression was performed on the
same 17 bivariate observations after autocorrelation was
modeled, Figure 9. The results of this regression are
highlighted in Table VIII. Inspection of the residuals
plotted against time, Figure 10, revealed the residuals to
be patternless and lying in a narrow interval around zero.
While the test for constant variance (Hotelling-Pabst
statistic, T=878, N=17), the Durbin-Watson statistic and the
"rough cut" measure of normality (94 percent of the
standardized residuals within the appropriate Student's t
value) support the assumptions of the model, the probability
and density plots, Figure 11, suggest major departures from
the assumption of normality of the error term. The
implications of the residuals not reflecting the assumptions
of the model will be discussed in the structure analysis
portion of the results and in the conclusions.

TABLE VIII

NONLINEAR REGRESSION 2 RESULTS

$\beta_0$     :  307094.63
$\beta_1$     :  -.382
D.W.          :  2.44
N             :  17
$\hat{\rho}$  :  [illegible]
$R^2$         :  .94

Figure 9. Ln Transformed Data, Autocorrelation
Transformation, First Observation Omitted

Figure 10. Residual Plot (residuals vs. time, 17 observations)

Figure 11. Normal Probability and Density Plots


B. VALIDATION 

The validation phase of this study consisted of a
predictive analysis of the different model specifications
and the regression techniques utilized. The initial
investigation of the predictive ability of each case
employed the prediction accuracy measures of MPE, MAPE, and
the Pearson correlation coefficients adjusted for degrees of
freedom. The results of these calculations are tabulated in
Table IX. The predicted and fitted results of each model
specification and regression method were transformed into
the units of the original model specification, i.e., direct
labor hours for the Xth unit for the unit learning curve,
average cost per unit for the cumulative average learning
curve, and the average cost in direct labor hours for the
units produced in time period t for the Boger et al.
learning curve, prior to calculating the prediction accuracy
measures. While the results for a model specification are
comparable over the various regressions performed, the
results are not directly comparable across model
specifications.

The negative values for MPE reflected that the initial
regression, linear or nonlinear, for each specification
overestimated the actual costs. On the other hand, after
the transformation for autocorrelation was performed, the
models usually underestimated the actual costs. The most
striking feature of this table is the extremely large values
of percent error after autocorrelation was modeled. This
implied the predicted values severely underestimated the
actual costs and could have been caused by predicting values
too far outside the range of the developmental data. When
the first observation was omitted following the adjustment
for autocorrelation, the predictions were slightly more
biased, but not by a large amount. Whereas the MPE for the
Boger et al. model, F-102 data, implied the model did not
predict well at all, the MPE for the Boger et al. model,
C-141 data, reflected excellent predictability. The Boger
et al. model, F-102 data, MPE was not at all consistent with
the MPE values for the unit and cumulative average learning
curves using the F-102 data. Conversely, the Boger et al.
model, C-141 data, MPE was consistent with the results of
the other specifications using the C-141 data. This
observation could be due to unrealistic refinements to the data
or the difference in sample size. After the transformation
of the data for autocorrelation was made, the predicted
values of the Boger et al. model for both the C-141 and
F-102 data were extremely high but consistent with the
results of the other specifications. Another result was
that the MPE values for each of the nonlinear regressions
(with no adjustment for autocorrelation) were larger than
those of the respective linear regressions.

TABLE IX

PREDICTION ACCURACY MEASURES--ENTIRE HOLDOUT SAMPLE

Data   Model        n_d  n_p  MPE          MAPE         R^2(fitted)  R^2(validate)
F-102  L Boger 1    17   5    [illegible]  [illegible]  [illegible]  [illegible]
F-102  L Boger 2    17   5    59.46        59.46        [illegible]  [illegible]
F-102  L Boger 3    16   5    61.58        [illegible]  [illegible]  .077
F-102  NL Boger 1   17   5    -89.42       89.42        [illegible]  [illegible]
F-102  NL Boger 2   17   5    38.20        38.20        [illegible]  [illegible]
F-102  L Cum 1      174  29   -4.08        4.08         [illegible]  [illegible]
F-102  NL Cum 1     174  29   -6.40        6.40         .979         .999
F-102  NL Cum 2     174  29   51.03        51.03        .174         -1.530
F-102  NL Cum 3     173  29   [illegible]  [illegible]  .018         -1.536
F-102  L Unit 1     173  29   1.64         5.78         .876         .914
F-102  L Unit 2     173  29   [illegible]  [illegible]  [illegible]  .912
F-102  L Unit 3     172  29   67.43        67.43        .863         [illegible]
F-102  NL Unit 1    173  29   [illegible]  5.81         .992         [illegible]
C-141  L Boger 1    17   6    [illegible]  [illegible]  [illegible]  [illegible]
C-141  L Boger 2    17   6    [illegible]  [illegible]  [illegible]  .923
C-141  L Boger 3    16   6    [illegible]  [illegible]  [illegible]  [illegible]
C-141  NL Boger 1   17   6    -29.44       29.44        .868         .996
C-141  NL Boger 2   17   6    61.30        61.30        .948         .960
C-141  L Cum 1      9    3    [illegible]  [illegible]  .980         .999
C-141  L Cum 2      9    3    46.32        46.32        -13.96       .916
C-141  L Cum 3      8    3    66.38        66.38        -.481        [illegible]
C-141  NL Cum 1     9    3    [illegible]  [illegible]  .986         [illegible]
C-141  L Unit 1     9    3    [illegible]  [illegible]  .376         .961
C-141  NL Unit 1    9    3    4.43         6.12         [illegible]  [illegible]

where
n_d   : number of developmental data points
n_p   : number of predicted data points
L     : linear regression model
NL    : nonlinear regression model
Boger : Boger et al. learning curve specification
Unit  : Unit learning curve specification
Cum   : Cumulative average learning curve specification

In most cases, MAPE was the absolute value of the 
respective MPE value. This implied that the models 
generally did not produce predictions that bracketed the 
actual values but rather predicted costs that were 
consistently either above or below the actual costs. 

Prior to the data being adjusted for autocorrelation, 
the R² (fitted) and R² (validate) values were in the 
interval (.75, .99) except for the Boger et. al. model for 
the F-102 data. While the Boger et. al. linear and 
nonlinear models, C-141 data, had slightly larger 
differences between R² values than the other specifications 
(reflecting slightly more variability in prediction results), 
the Boger et. al. linear and nonlinear models, F-102 data, 
reflected extremely high variability of the fitted and 
predicted residuals relative to the variability of the 
dependent variable of the developmental data--which is not a 
desirable trait of a model. Negative values for R² are 
indicative of cases where the sample variance of the 
residuals is higher than the sample variance of the 
developmental dependent variable. In all cases, when the 
autocorrelation transformation was incorporated, the R² 
values decreased and the differences between R² (fitted) 
and R² (validate) grew larger. 

The same prediction accuracy measures were calculated 
for predicted values not as far outside the developmental 
data range. These results are also tabulated in Table X. 
Whereas the MPE and MAPE values decreased slightly (except 
for the Boger et. al. model, F-102 data), the R² values 
remained essentially unchanged. The same trends described 
for the previous table apply to this table also. The 
implication of the results in this table is that the 
distance of the predicted values outside the developmental 
range had little effect on the initial prediction accuracy 
measures. 
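
The negative R² values mentioned above follow directly from 
comparing two sample variances. The sketch below assumes R² 
is computed as one minus the ratio of the sample variance of 
the residuals to the sample variance of the developmental 
dependent variable, which is consistent with the description 
in the text but is not necessarily the exact formula used in 
the analysis. All numbers are hypothetical.

```python
from statistics import variance


def r_squared(residuals, developmental_y):
    """1 - s^2(residuals) / s^2(developmental dependent variable).

    Assumed form; the result is negative whenever the residuals
    vary more than the developmental dependent variable does.
    """
    return 1.0 - variance(residuals) / variance(developmental_y)


# Hypothetical values, for illustration only.
developmental_y = [12.0, 11.0, 10.5, 10.0, 9.8]
small_residuals = [0.1, -0.2, 0.1, 0.0, -0.1]
large_residuals = [1.5, -2.0, 2.5, -1.0, -2.5]

r2_good = r_squared(small_residuals, developmental_y)
r2_bad = r_squared(large_residuals, developmental_y)
print(round(r2_good, 3), round(r2_bad, 3))
```

The first value is close to one, while the second is negative 
because the residual variance exceeds the variance of the 
developmental dependent variable.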


D. STRUCTURAL ANALYSIS 

In most cases, the error process of the linear and 
nonlinear statistical models did not exhibit the desired 
normally distributed, random structure but, instead, 
exhibited a structure in which the errors of adjacent 
observations were related to each other. As discussed 
above, the presence of autocorrelation in the residuals of a 
model results in biased estimates of the standard errors of 
the regression coefficients. Hence, the standard t-tests 
for significance of the difference of the estimates of the 
regression coefficients from zero, and the coefficients of 
determination, may be erroneous. 

In all cases where the Durbin-Watson test for 
autocorrelation resulted in accepting the alternative 
hypothesis (H1: ρ > 0), this problem was addressed by 
modeling this phenomenon into the data and performing 
subsequent regressions. In every case, the R² value of the 
regression decreased after modeling AR(1) and then 
increased after the first observation was omitted. 
Similarly, the t-statistics followed the same trend, and, in 
all cases, the estimated coefficients were statistically 
significant. The statistical significance of the estimated 
coefficients and the R² values (listed in Appendices D, E, 
F, G, H, I) indicated that there is indeed a good amount of 
information contained in, and a good deal of the variation 
is explained by, the regression model. 
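
The sequence described above (test for first-order 
autocorrelation, transform the data, rerun the regression 
with and without the first observation) can be sketched as 
follows. This is only an illustration under stated 
assumptions: it uses a quasi-differencing transformation of 
the Prais-Winsten type (first observation kept and rescaled) 
or the Cochrane-Orcutt type (first observation dropped), and 
a simple residual-based estimate of ρ. The function names 
and residual values are hypothetical, and the procedure 
actually used in this study may differ.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic; values well below 2 suggest rho > 0."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den


def estimate_rho(residuals):
    """Least-squares slope of e[t] on e[t-1] (no intercept)."""
    num = sum(residuals[t] * residuals[t - 1]
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals[:-1])
    return num / den


def ar1_transform(series, rho, keep_first=True):
    """Quasi-difference a series: z[t] = y[t] - rho * y[t-1].

    keep_first=True keeps a rescaled first observation
    (Prais-Winsten style); keep_first=False drops it
    (Cochrane-Orcutt style).
    """
    out = [series[t] - rho * series[t - 1] for t in range(1, len(series))]
    if keep_first:
        out.insert(0, (1.0 - rho ** 2) ** 0.5 * series[0])
    return out


# Hypothetical residuals with a slow cyclic drift.
resid = [1.0, 0.8, 0.5, 0.1, -0.3, -0.6, -0.8, -0.5, -0.1, 0.4]
dw = durbin_watson(resid)
rho = estimate_rho(resid)
print(round(dw, 3), round(rho, 3))
```

In use, the same transformation would be applied to the 
dependent and independent variables before the follow-on 
regression is run.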

After modeling the autocorrelation into the data and 
performing follow-on regressions, the nature of the 
residuals changed. The initial regression usually generated 
results that had a distinct cyclic pattern. The follow-on 
regressions reflected a linear pattern in two cases, but 
always a non-cyclic pattern--usually patternless. 

In all cases after autocorrelation was modeled, the 
residuals also appeared to be and were statistically 
verified to be homoskedastic. Other distributional 
observations were made. In the small sample sizes (N=9, 
C-141 unit and cumulative average models), the 
residuals of the follow-on regressions, both linear and 
nonlinear, met the "rough-cut" requirements for normality. 
These normality assumptions were further reflected in the 
probability and symmetry plots and the estimated third and 
fourth moments. In the mid-sized samples (N=17, C-141 and 
F-102 Boger et. al. data), the residuals of the follow-on 
regressions reflected the normality assumptions through the 
"rough-cut" requirements, the probability and symmetry plots 
and the estimated third and fourth moments (except for the 
C-141 nonlinear regression for the Boger et. al. data). 
While the "rough-cut" requirements were met for the large 
sample sizes (N=173, F-102 unit and cumulative average 
data), the probability and symmetry plots and the estimated 
third and fourth moments suggested that major departures 
from the assumptions of normality existed. These 
inconsistent observations may be caused by the 
differences in sample sizes, adjustments that were made to 
the data, or poor models. It also appeared that the 
"rough-cut" measures of normality were not very 
discriminating. 
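
The estimated third and fourth moments referred to above can 
be computed as in the sketch below. It uses the standardized 
moments (skewness and kurtosis), which take the values 0 and 
3 for a normal distribution. The "rough-cut" requirements 
are not reproduced here, and the sample values are 
hypothetical.

```python
def standardized_moments(residuals):
    """Return (skewness, kurtosis) from central moments.

    For normally distributed residuals the skewness is near 0
    and the kurtosis is near 3.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
right_tailed = [0.0, 0.0, 0.0, 0.0, 10.0]

print(standardized_moments(symmetric))
print(standardized_moments(right_tailed))
```

A skewness far from 0 or a kurtosis far from 3 is the kind of 
departure from normality that the moment estimates flagged 
for the large samples.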


E. COMPARISON OF FITTED MODELS 

One of the secondary aspects of this research was to 
graphically compare the fitted models, both linear and 
nonlinear, against the observed developmental data in the 
units of the original models. 

The fitted model results for the Boger et. al. model, 
linear and nonlinear regressions, C-141 data, are plotted in 
Figure 12. The observed independent variable of the 
developmental data is plotted against the observed and 
fitted dependent variable values. As discussed above, the 
units for each fitted model have been transformed into the 
units of the original model. The initial linear regression 
with no autocorrelation modeled into the data, surprisingly, 
has a better fit than its nonlinear counterpart. After the 
transformation for autocorrelation was performed, however, 
the linear model had a poor fit while the nonlinear 
regression had an excellent fit. Although a third nonlinear 
regression was not performed, the linear regression with 
autocorrelation modeled and the first observation dropped 
had a poor initial fit but an excellent fit for the latter 
part of the developmental data range. 

The remaining fitted models are listed in Appendix J. 
Generally speaking, the observations of each fitted model 
and regression technique were consistent across both sets of 
data. Prior to the adjustment for autocorrelation, both the 
linear and nonlinear regressions were comparable (except in 
the case of the Boger et. al. model, F-102 data). This was 
a surprising result since one would expect the nonlinear 
regression to have a much better fit than the linear 
regression for nonlinear data. 
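
The distinction between the two regression techniques can be 
made concrete with a small sketch. It assumes the usual 
power form of the learning curve, Y = aX^b. The linear 
technique applies ordinary least squares to the log-
transformed data, while the nonlinear technique minimizes the 
squared error in the original units. The search method 
(golden-section search on b, with a solved in closed form) 
and the data are assumptions made for this illustration and 
are not the algorithm used in this study.

```python
import math


def fit_loglinear(x, y):
    """OLS on ln(y) = ln(a) + b * ln(x); returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    sxx = sum((u - mx) ** 2 for u in lx)
    b = sxy / sxx
    return math.exp(my - b * mx), b


def best_a(x, y, b):
    """For a fixed exponent b, the least-squares a has a closed form."""
    num = sum(v * u ** b for u, v in zip(x, y))
    return num / sum(u ** (2 * b) for u in x)


def sse(x, y, a, b):
    """Sum of squared errors in the original (untransformed) units."""
    return sum((v - a * u ** b) ** 2 for u, v in zip(x, y))


def fit_nonlinear(x, y, lo=-2.0, hi=0.0, iters=100):
    """Least squares in original units via golden-section search on b."""
    g = (math.sqrt(5.0) - 1.0) / 2.0

    def f(b):
        return sse(x, y, best_a(x, y, b), b)

    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    for _ in range(iters):
        if f(c) < f(d):
            hi, d = d, c
            c = hi - g * (hi - lo)
        else:
            lo, c = c, d
            d = lo + g * (hi - lo)
    b = (lo + hi) / 2.0
    return best_a(x, y, b), b


# Hypothetical cumulative quantities and unit costs.
x = [1, 2, 4, 8, 16, 32]
y = [100.0, 83.0, 64.0, 55.0, 43.0, 36.0]

a_lin, b_lin = fit_loglinear(x, y)
a_nl, b_nl = fit_nonlinear(x, y)
print(round(b_lin, 4), round(sse(x, y, a_lin, b_lin), 3))
print(round(b_nl, 4), round(sse(x, y, a_nl, b_nl), 3))
```

When the data lie close to a straight line on logarithmic 
paper, the two techniques give similar coefficients, which 
is one possible reading of why the linear and nonlinear fits 
were comparable.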

After the transformation for autocorrelation was made, 
the fitted linear models appeared to fit poorly. On the 
other hand, the fitted nonlinear models, while not as good 
as the models prior to the adjustment for autocorrelation, 
appeared to have better fits than their linear counterparts. 

Finally, after the initial observation was omitted 
following the transformation for autocorrelation, an 
interesting observation was noted. In all cases, the fitted 
model--both linear and nonlinear--was poor for the initial 
portion of the developmental data but appeared to be an 
excellent fit for the latter portion of the developmental 
data. 


F. COMPARISON OF FITTED LOT COSTS 

The cost for each lot derived from the fitted models for 
each of the regressions performed for both the C-141 and the 
F-102 data is plotted against the observed cost per lot in 
Appendix K. The fitted lot costs for the C-141 data are 
plotted for lots two through eight. Only these seven lots 
are plotted and used for comparison since omission of data 
points in some regressions and production data for a lot 
lying outside the developmental data range result in 
incomplete fitted lot costs. The fitted lot costs for the 
C-141 are plotted against the respective observed lot costs, 
which are the same in each plot. The fitted lot costs for 
the F-102 data are plotted for lots four through nine for 
the same reasons cited above. The observed lot costs for 
the F-102 data are not the same for each plot since some 
outliers were initially identified and omitted (not always 
the same points) prior to performing the regression. 
Inclusion of these outliers in the calculation of observed 
lot costs, in some cases, would bias the fitted lot costs 
down. 
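
As an illustration of how a fitted learning curve translates 
into a cost per lot, the sketch below uses the textbook 
power-form curves. For the unit specification the lot cost 
is the sum of the fitted unit costs over the units in the 
lot. For the cumulative average specification the total cost 
of the first N units is N times the fitted average, and the 
lot cost is the difference of two totals. The coefficients 
and lot boundaries are hypothetical, and the conversion 
actually used for the plots in Appendix K may differ.

```python
import math


def lot_cost_unit(a, b, first, last):
    """Lot cost under a unit curve: sum of a * x**b over the lot."""
    return sum(a * x ** b for x in range(first, last + 1))


def lot_cost_cum_avg(a, b, first, last):
    """Lot cost under a cumulative average curve.

    Total cost through unit n is n * (a * n**b) = a * n**(b + 1).
    """
    def total(n):
        return a * n ** (b + 1.0) if n > 0 else 0.0

    return total(last) - total(first - 1)


# Hypothetical 80 percent curve with a first-unit cost of 100.
a = 100.0
b = math.log2(0.8)
lots = [(1, 10), (11, 30), (31, 60)]   # (first unit, last unit)

for first, last in lots:
    print(first, last,
          round(lot_cost_unit(a, b, first, last), 1),
          round(lot_cost_cum_avg(a, b, first, last), 1))
```

With the same coefficients the two specifications give 
different lot costs, which is why each fitted model is first 
translated into costs per lot before being compared.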


Visual inspection of the fitted lot costs plots, 
Appendix K, gives a good impression of the fit of each 
specification of the learning curve to the lot costs. Since 
each specification has been translated into fitted costs per 
lot, a basis exists for comparison across regression 
techniques and learning curve specifications. Figure 13 is 
an example of one plot of the fitted costs per lot for the 
Boger et. al. model, nonlinear regression, C-141 data. 
Whereas the initial regression does not appear to provide a 
good fit to the observed data, the fitted costs per lot 
after autocorrelation was modeled have a much better fit. 

Figure 13. Fitted Lot Costs Results: Boger et. al. Model, 
Nonlinear Regression, C-141 Data 

In general, the unit learning curve, both linear and 
nonlinear regression techniques, with and without 
transformations for autocorrelation, provided the best 
fitted lot costs for the F-102 production data. With 
respect to the cumulative average learning curve 
specification, the linear and nonlinear regressions without 
transformations for autocorrelation appear to have excellent 
lot cost fits--not as good as but comparable to the unit 
specification fits. The fitted lot costs for the cumulative 
average model, nonlinear regression with the transformation 
for autocorrelation, appear to have reasonable fits--but not 
as good as their unit specification counterparts. Whereas 
the linear regression for the Boger et. al. model appears to 
have a better fit than its nonlinear counterpart (except 
when autocorrelation is modeled) and a good fit overall, the 
fitted lot costs do not compare favorably with the 
cumulative average and the unit learning curve 
specifications. A nonparametric statistical test was then 
performed comparing the linear and nonlinear regression 
results, no autocorrelation modeled, of all three models. 
The purpose of the test was to statistically compare the 
fitted lot costs for each model. As discussed above, the 
Kruskal-Wallis test was performed using the vectors of 
differences between the fitted lot costs and observed lot 
costs for the different models as the treatments. The null 
hypothesis (each model tends to yield identical residual lot 
costs) was rejected with the Kruskal-Wallis test statistic 
T = 19.07, 5 degrees of freedom, .001 < α < .005. Multiple 
comparisons were then performed between models with 
α = .05, 30 degrees of freedom. At this level, the Boger 
et. al. model, both linear and nonlinear regression results, 
tended to yield larger residual lot costs than both the unit 
and cumulative average models. The cumulative average and 
unit learning curve specifications tended to yield residual 
lot costs that were statistically equal. 
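
For readers unfamiliar with the test, the Kruskal-Wallis 
statistic can be computed from the pooled ranks of the 
treatments as sketched below. The sketch uses the form of 
the statistic that allows for tied ranks and reduces to the 
familiar 12/(N(N+1)) expression when there are no ties. Six 
treatments (three models, each with a linear and a nonlinear 
regression) would give the 5 degrees of freedom reported 
above. The function names and residual values are 
hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0   # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Kruskal-Wallis statistic; compare with chi-square, k - 1 df."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    correction = n_total * (n_total + 1) ** 2 / 4.0
    s2 = (sum(r * r for r in ranks) - correction) / (n_total - 1)
    between, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        between += rank_sum ** 2 / len(g)
        start += len(g)
    return (between - correction) / s2


# Hypothetical residual lot costs for three treatments.
treatments = [
    [0.4, -0.2, 0.1, 0.3],
    [1.9, 2.4, 1.7, 2.2],
    [0.5, -0.1, 0.2, 0.6],
]
print(round(kruskal_wallis(treatments), 3))
```

A large value of the statistic relative to the chi-square 
distribution with k - 1 degrees of freedom leads to rejection 
of the null hypothesis that the treatments yield identical 
residual lot costs.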

With respect to the C-141 data, the unit learning curve 
specification, linear and nonlinear regressions, appears to 
have excellent fitted lot costs--seemingly better than the 
cumulative average and Boger et. al. specifications. The 
linear and nonlinear fitted lot costs of the cumulative 
average and Boger et. al. models, contrary to the F-102 
data, compared favorably. A nonparametric statistical test 
was then performed comparing the linear and nonlinear 
regression results, no autocorrelation modeled, of all three 
models. The purpose of the test and data description are 
the same as above. The null hypothesis was rejected with 
the Kruskal-Wallis test statistic T = 13.22, 5 degrees of 
freedom. Multiple comparisons were then performed with 
α = .05 and 36 degrees of freedom. At this level, the unit 
specification, linear regression, tended to yield smaller 
residual lot costs. All the other models tended to yield 
statistically equal residual lot costs. 

Generally speaking, the linear models, with the 
transformation for autocorrelation performed, resulted in 
very poor lot cost fits for both sets of data. On the other 
hand, the nonlinear regressions with autocorrelation modeled 
resulted in reasonable fits. Finally, when the first 
observation was omitted after modeling autocorrelation, the 
fitted lot costs were reasonable for both the linear and 
nonlinear regression techniques. 




VI. CONCLUSIONS 


The primary purpose of this research was to empirically 
investigate the validity of a reformulation of the 
cumulative average learning curve derived and discussed by 
Boger, Jones and Sontheimer in "Budgets, Contracts, 
Incentives and Costs: A Stylized Nexus" [Ref. 14:p. 23]. 
In the process of conducting this investigation, the impacts 
of linear versus nonlinear regression methods and modeling 
autocorrelation were also addressed. 

The linear and nonlinear Boger et. al. models for both 
sets of data, before autocorrelation was modeled, while not 
as good as the fitted cumulative average and unit learning 
curve models, did not suggest gross inadequacies. 
Similarly, the fitted cost per lot for the Boger et. al. 
model, while statistically different from the cumulative 
average and unit specifications for the F-102 data, was not 
statistically different from the cumulative average model for 
the C-141 data. Again, the plots of the fitted costs per 
lot did not suggest gross inadequacies of the Boger et. al. 
model. 

Surprisingly, it was also noted that the nonlinear 
regressions did not consistently provide much better fitted 
models and fitted lot costs. Also, in agreement with other 
literature and research, the unit learning curve 
specification generally provided better fitted models and 
fitted lot costs than both other models. 

The predictive ability of the Boger et. al. model for 
the C-141 data was consistent with the cumulative average 
and unit specifications. This was not true for the F-102 
data, which is partly attributed to the noise in the data 
in the case of the Boger et. al. model. 

Whenever autocorrelation was modeled into the data, 
poorly fitted lot costs emerged in the linear regression 
cases. On the other hand, when autocorrelation was modeled 
during the nonlinear regressions, the results were not 
substantially degraded. The predictive ability of all 
models was adversely affected when the autocorrelation was 
modeled. Areas for further research would include other 
methods of autocorrelation modeling and the effects that 
other estimates of ρ might have. 

While the structure of the residuals did not always 
reflect the assumptions of the model being analyzed, which 
might lead one to consider rejecting the model, Pesaran 
cautions: 

There is no theoretical justification for expecting a 
correctly specified model to possess all the 
characteristics of the classical regression models. The 
assumptions underlying the classical regression models are 
made, not because they are optimal from the point of view 
of economic theory, but because they are extremely 
convenient for estimation and hypothesis testing purposes. 
[Ref. 24:p. 154] 

While observing some contradictory results between the 
two sets of aircraft production data, this researcher feels 
that the results generally suggest the Boger et. al. 
learning curve specification is an adequate model. This 
conclusion is tempered by several observations. It is felt 
that the C-141 and F-102 data used were a severe limitation 
on the scope of this study. While the sample size of the 
F-102 data was generally large enough for the analysis, the 
adjustments made to the data to meet the form required by 
the Boger et. al. model (discussed in detail by Womer and 
Gulledge [Ref. 12:p. 81]) are rough approximations and have 
introduced considerable noise into the data. On the other 
hand, whereas the data for the C-141 analysis appeared to be 
very smooth, the small sample size was a limitation. This 
researcher feels that a more conclusive analysis could be 
performed with considerably more effort going into the data 
gathering stage with dialogue between the analyst and the 
data source. Finally, the adjustments made to the data for 
the Boger et. al. model in this study used equivalent units 
produced per time period based on approximate production 
rates to generate the independent and dependent variables. 
Other proxy variables might also be worth investigating. 


APPENDIX A 


APPENDIX B 


ADJUSTED UNIT LEARNING CURVE DATA 


APPENDIX C 


ADJUSTED BOGER ET AL LEARNING CURVE DATA 


APPENDIX D 


BOGER ET AL MODEL:  C-141 DATA ANALYSIS RESULTS 


[Plots: RAW DATA: 18 OBSERVATIONS (DEPENDENT VARIABLE axis); 
LN TRANSFORMED DATA: 17 OBSERVATIONS (LN [DEPENDENT VARIABLE] 
versus LN [INDEPENDENT VARIABLE])] 


Figure D-1. Boger et al Specification:  C-141 Data 


APPENDIX E 


UNIT LEARNING CURVE:  C-141 DATA ANALYSIS RESULTS 


Figure E-1. Unit Learning Curve:  C-141 Data 


APPENDIX F 


CUMULATIVE AVERAGE LEARNING CURVE:  C-141 DATA ANALYSIS RESULTS 


Figure F-1. Cumulative Average Learning Curve:  C-141 Data 


APPENDIX G 


BOGER ET AL MODEL:  F-102 DATA ANALYSIS RESULTS 


RAW DATA: 18 OBSERVATIONS 


Figure G-1. Boger et al Specification:  F-102 Data 


APPENDIX H 


UNIT LEARNING CURVE:  F-102 DATA ANALYSIS RESULTS 


Figure H-1. Unit Learning Curve:  F-102 Data 


APPENDIX I 


CUMULATIVE AVERAGE LEARNING CURVE:  F-102 DATA ANALYSIS RESULTS 


Figure I-1. Cumulative Average Learning Curve:  F-102 Data 